28 research outputs found

    Clusterfile: a parallel file system for clusters

    The Paradis-Net API

    Work in progress about enhancing the programmability and energy efficiency of storage in HPC and cloud environments

    Proceedings of the First PhD Symposium on Sustainable Ultrascale Computing Systems (NESUS PhD 2016), Timisoara, Romania, February 8-11, 2016. We present the work in progress for the PhD thesis titled “Enhancing the programmability and energy efficiency of storage in HPC and cloud environments”. In this thesis, we focus on studying and optimizing data movement across different layers of the operating system’s I/O stack. We study the power consumption during I/O-intensive workloads using sophisticated software and hardware instrumentation, collecting time series data from the internal ATX power lines that feed every system component, together with several run-time operating system metrics. Data exploration and analysis reveal various power and performance regimes for each I/O access pattern. These regimes show how the system uses power as data moves through the I/O stack. We use this knowledge to build I/O power models that can predict power consumption for different I/O workloads, and to optimize the CPU device driver that manages performance states, obtaining power savings of over 30%. Finally, we develop new mechanisms and abstractions that allow co-located virtual machines to share data with each other more efficiently. Our virtualized data sharing solution reduces data movement among virtual domains, leading to energy savings and I/O performance improvements. European Cooperation in Science and Technology. COS

    Grid Computing und Peer-to-Peer Systeme. Seminar SS 2004

    In the summer semester of 2004, the seminar "Grid Computing und Peer-to-Peer Systeme" offered a range of current topics from the fields of grid computing, peer-to-peer systems, and ad hoc networks. Each participant chose one of these topics and presented it in a media-supported talk. To give all participants the opportunity to take away something lasting from the seminar, each speaker prepared a written report that was made available to everyone. These reports, lightly edited by the editors, are collected in this technical report.

    Making the case for reforming the I/O software stack of extreme-scale systems

    This work was supported in part by the U.S. Department of Energy, Office of Science, Advanced Scientific Computing Research, under Contract No. DE-AC02-05CH11231. This research has been partially funded by the Spanish Ministry of Science and Innovation under grant TIN2010-16497 “Input/Output techniques for distributed and high-performance computing environments”. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement number 328582

    Topology-Aware Data Aggregation for Intensive I/O on Large-Scale Supercomputers

    Reading and writing data efficiently from storage systems is critical for high performance data-centric applications. These I/O systems are being increasingly characterized by complex topologies and deeper memory hierarchies. Effective parallel I/O solutions are needed to scale applications on current and future supercomputers. Data aggregation is an efficient approach consisting of electing some processes in charge of aggregating data from a set of neighbors and writing the aggregated data into storage. Thus, the bandwidth use can be optimized while the contention is reduced. In this work, we take into account the network topology for mapping aggregators and we propose an optimized buffering system in order to reduce the aggregation cost. We validate our approach using micro-benchmarks and the I/O kernel of a large-scale cosmology simulation. We show improvements of up to 15× for I/O operations compared to a standard implementation of MPI I/O.

    Surfing the optimization space of a multiple-GPU parallel implementation of a X-ray tomography reconstruction algorithm

    The increasing popularity of massively parallel architectures based on accelerators has opened up the possibility of significantly improving the performance of X-ray computed tomography (CT) applications towards achieving real-time imaging. However, achieving this goal is a challenging process, as most CT applications have not been designed to exploit the amount of parallelism available in these architectures. In this paper we present the massively parallel implementation and optimization of Mangoose(++), a CT application for reconstructing 3D volumes from 2D images collected by scanners based on cone-beam geometry. The main contributions of this paper are the following. First, we develop a modular application design that makes it possible to exploit the functional parallelism inside the application and facilitates the parallelization of individual application phases. Second, we identify a set of optimizations that can be applied individually and in combination for optimally deploying the application on a massively parallel multi-GPU system. Third, we present a study of surfing the optimization space of the modularized application and demonstrate that a significant benefit can be obtained from employing an adequate combination of application optimizations. (C) 2014 Elsevier Inc. All rights reserved. This work was partially funded by the Spanish Ministry of Science and Technology under the grant TIN2010-16497, the AMIT project (CEN-20101014) from the CDTI-CENIT program, RECAVA-RETIC Network (RD07/0014/2009), projects TEC2010-21619-C04-01, TEC2011-28972-C02-01, and PI11/00616 from the Spanish Ministerio de Ciencia e Innovacion, and the ARTEMIS program (S2009/DPI-1802) from the Comunidad de Madrid.

    Analyzing Power Consumption of I/O Operations in HPC Applications

    Data movement is becoming a key issue in terms of performance and energy consumption in high performance computing (HPC) systems, in general, and Exascale systems, in particular. A preliminary step to perform I/O optimization and face the Exascale challenges is to deepen our understanding of energy consumption across the I/O stacks. In this paper, we analyze the power draw of different I/O operations using a new fine-grained internal wattmeter while simultaneously collecting system metrics. Based on correlations between the recorded metrics and the instantaneous internal power consumption, our methodology identifies the significant metrics with respect to power consumption and decides which ones should contribute directly or in a derivative manner. This approach has the advantage of building I/O power models based on a previous set of identified utilization metrics. This technique will be validated using write operations on an Intel Xeon Nehalem server system, as writes exhibit interesting patterns and distinct power regimes. The work presented in this paper has been partially supported by the EU Project FP7 318793 “EXA2GREEN” and partially supported by the EU under the COST Programme Action IC1305, “Network for Sustainable Ultrascale Computing (NESUS)” and by the grant TIN2013-41350-P, Scalable Data Management Techniques for High-End Computing Systems from the Spanish Ministry of Economy and Competitiveness. European Community's Seventh Framework Progra